• The d-Separation Criterion in Categorical Probability

    Updated: 2023-01-31 16:29:40
    Tobias Fritz, Andreas Klingler; 24(46):1–49, 2023. Abstract: The d-separation criterion detects the compatibility of a joint probability distribution with a directed acyclic graph through certain conditional independences. In this work, we study this problem in the context of categorical probability theory by introducing a categorical definition of causal models, a categorical notion of d-separation, and proving an abstract version of the d-separation criterion. This approach has two main benefits. First, categorical d-separation is a very intuitive …
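    The paper's result is abstract, but the classical criterion it generalizes is easy to check on a concrete DAG via the usual ancestral-moralization argument. The sketch below is not from the paper; it assumes only basic networkx graph handling, and the three-node collider DAG is a made-up example.

        import networkx as nx

        def d_separated(dag, xs, ys, zs):
            """Classical d-separation check: moralize the ancestral subgraph, drop zs, test separation."""
            xs, ys, zs = set(xs), set(ys), set(zs)
            # Restrict to the ancestral closure of all query nodes.
            relevant = set()
            for v in xs | ys | zs:
                relevant |= nx.ancestors(dag, v) | {v}
            sub = dag.subgraph(relevant)
            # Moralize: marry co-parents, then forget edge directions.
            moral = nx.Graph(sub.to_undirected())
            for v in sub.nodes:
                parents = list(sub.predecessors(v))
                for i in range(len(parents)):
                    for j in range(i + 1, len(parents)):
                        moral.add_edge(parents[i], parents[j])
            # Remove the conditioning set and test ordinary graph separation.
            moral.remove_nodes_from(zs)
            return all(not nx.has_path(moral, x, y)
                       for x in xs for y in ys if x in moral and y in moral)

        g = nx.DiGraph([("a", "c"), ("b", "c")])    # collider a -> c <- b
        print(d_separated(g, {"a"}, {"b"}, set()))  # True: marginally independent
        print(d_separated(g, {"a"}, {"b"}, {"c"}))  # False: conditioning on the collider couples a and b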

  • Robust Load Balancing with Machine Learned Advice

    Updated: 2023-01-31 16:29:40
    Sara Ahmadian, Hossein Esfandiari, Vahab Mirrokni, Binghui Peng; 24(44):1–46, 2023. Abstract: Motivated by the exploding growth of web-based services and the importance of efficiently managing the computational resources of such systems, we introduce and study a theoretical model for load balancing of very large databases such as commercial search engines. Our model is a more realistic version of the well-received bab model with an additional constraint that limits the number of servers that carry each piece of the data. This additional constraint is …

  • Benchmarking Graph Neural Networks

    Updated: 2023-01-31 16:29:40
    Vijay Prakash Dwivedi, Chaitanya K. Joshi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson; 24(43):1–48, 2023. Abstract: In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has witnessed an extensive growth of promising techniques that have been applied with success to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress. This led us …

  • Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data

    Updated: 2023-01-31 16:29:40
    Shaowu Pan, Steven L. Brunton, J. Nathan Kutz; 24(41):1–60, 2023. Abstract: High-dimensional spatio-temporal dynamics can often be encoded in a low-dimensional subspace. Engineering applications for modeling, characterization, design, and control of such large-scale systems often rely on dimensionality reduction to make solutions computationally tractable in real time. Common existing paradigms for dimensionality reduction include linear methods, such as the singular value decomposition (SVD), and nonlinear …

  • Label Distribution Changing Learning with Sample Space Expanding

    Updated: 2023-01-31 16:29:40
    Chao Xu, Hong Tao, Jing Zhang, Dewen Hu, Chenping Hou; 24(36):1–48, 2023. Abstract: With the evolution of data collection methods, label ambiguity has arisen in various applications. How to reduce its uncertainty and leverage its effectiveness is still a challenging task. As two types of representative label ambiguities, Label Distribution Learning (LDL), which annotates each instance with a label distribution, and Emerging New Class (ENC), which focuses on model reuse with new classes, have attracted extensive attention. Nevertheless, in …

  • Gap Minimization for Knowledge Sharing and Transfer

    Updated: 2023-01-31 16:29:40
    Boyu Wang, Jorge A. Mendez, Changjian Shui, Fan Zhou, Di Wu, Gezheng Xu, Christian Gagné, Eric Eaton; 24(33):1–57, 2023. Abstract: Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of performance gap, an intuitive and novel measure of the distance between learning tasks. Unlike …

  • Sparse PCA: a Geometric Approach

    Updated: 2023-01-31 16:29:40
    Dimitris Bertsimas, Driss Lahlou Kitane; 24(32):1–33, 2023. Abstract: We consider the problem of maximizing the variance explained from a data matrix using orthogonal sparse principal components that have a support of fixed cardinality. While most existing methods focus on building principal components (PCs) iteratively through deflation, we propose GeoSPCA, a novel algorithm to build all PCs at once while satisfying the orthogonality constraints, which brings substantial benefits over deflation. This novel approach is based on the left eigenvalues of the covariance matrix …

  • Labels, Information, and Computation: Efficient Learning Using Sufficient Labels

    Updated: 2023-01-31 16:29:40
    Shiyu Duan, Spencer Chang, Jose C. Principe; 24(31):1–35, 2023. Abstract: In supervised learning, obtaining a large set of fully-labeled training data is expensive. We show that we do not always need full label information on every single training example to train a competent classifier. Specifically, inspired by the principle of sufficiency in statistics, we present a statistic (a summary) of the fully-labeled training set that captures almost all the relevant information for classification but at the same time is …

  • HiClass: a Python Library for Local Hierarchical Classification Compatible with Scikit-learn

    Updated: 2023-01-31 16:29:40
    Fábio M. Miranda, Niklas Köhnecke, Bernhard Y. Renard; 24(29):1–17, 2023. Abstract: HiClass is an open-source Python library for local hierarchical classification entirely compatible with scikit-learn. It contains implementations of the most common design patterns for hierarchical machine learning models found in the literature, that is, the local classifiers per node, per parent node and per level. Additionally, the package contains implementations of hierarchical metrics, which are more appropriate for …
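    As a rough usage sketch of the scikit-learn-style API the abstract describes: the class name LocalClassifierPerNode mirrors the "local classifier per node" pattern, but treat the exact names and arguments as assumptions rather than as checked against the current HiClass release.

        from hiclass import LocalClassifierPerNode
        from sklearn.ensemble import RandomForestClassifier

        # Hierarchical labels: one row per sample, one column per level of the hierarchy.
        X_train = [[0.1, 0.2], [0.3, 0.4], [0.9, 0.8]]
        y_train = [
            ["Animal", "Mammal", "Cat"],
            ["Animal", "Bird", "Sparrow"],
            ["Plant", "Tree", "Oak"],
        ]

        # Wrap any scikit-learn classifier as the local model trained at each node.
        clf = LocalClassifierPerNode(local_classifier=RandomForestClassifier())
        clf.fit(X_train, y_train)
        print(clf.predict([[0.2, 0.3]]))  # one predicted path through the hierarchy per sample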

  • The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time

    Updated: 2023-01-31 16:29:40
    Raj Agrawal, Tamara Broderick; 24(27):1–60, 2023. Abstract: Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection. Unfortunately, methods that simultaneously express sparsity, nonlinearity, and interactions are computationally intractable, with runtime at least …

  • Generalization Bounds for Noisy Iterative Algorithms Using Properties of Additive Noise Channels

    Updated: 2023-01-31 16:29:40
    Hao Wang, Rui Gao, Flavio P. Calmon; 24(26):1–43, 2023. Abstract: Machine learning models trained by different optimization algorithms under different data distributions can exhibit distinct generalization behaviors. In this paper, we analyze the generalization of models trained by noisy iterative algorithms. We derive distribution-dependent generalization bounds by connecting noisy iterative algorithms to additive noise channels found in communication and information theory. Our generalization bounds shed …

  • Discrete Variational Calculus for Accelerated Optimization

    Updated: 2023-01-31 16:29:40
    Cédric M. Campos, Alejandro Mahillo, David Martín de Diego; 24(25):1–33, 2023. Abstract: Many of the new developments in machine learning are connected with gradient-based optimization methods. Recently, these methods have been studied using a variational perspective (Betancourt et al., 2018). This has opened up the possibility of introducing variational and symplectic methods using geometric integration. In particular, in this paper, we introduce variational integrators (Marsden and West, 2001) which allow us to derive different methods for …

  • Calibrated Multiple-Output Quantile Regression with Representation Learning

    Updated: 2023-01-31 16:29:40
    Shai Feldman, Stephen Bates, Yaniv Romano; 24(24):1–48, 2023. Abstract: We develop a method to generate predictive regions that cover a multivariate response variable with a user-specified probability. Our work is composed of two components. First, we use a deep generative model to learn a representation of the response that has a unimodal distribution. Existing multiple-output quantile regression approaches are effective in such cases, so we apply them on the learned representation, and then transform the solution to the original …

  • Bayesian Data Selection

    Updated: 2023-01-31 16:29:40
    Eli N. Weinstein, Jeffrey W. Miller; 24(23):1–72, 2023. Abstract: Insights into complex, high-dimensional data can be obtained by discovering features of the data that match or do not match a model of interest. To formalize this task, we introduce the data selection problem: finding a lower-dimensional statistic (such as a subset of variables) that is well fit by a given parametric model of interest. A fully Bayesian approach to data selection would be to parametrically model the value of the statistic, nonparametrically model the remaining background components of the data, and …

  • Graph-Aided Online Multi-Kernel Learning

    Updated: 2023-01-31 16:29:40
    Pouya M. Ghari, Yanning Shen; 24(21):1–44, 2023. Abstract: Multi-kernel learning (MKL) has been widely used in learning problems involving function learning tasks. Compared with the single-kernel learning approach, which relies on a pre-selected kernel, the advantage of MKL is the flexibility that results from combining a dictionary of kernels. However, inclusion of irrelevant kernels in the dictionary may deteriorate the accuracy of MKL and increase the computational complexity. Faced with this challenge, a novel graph-aided framework is developed to select a subset of …

  • Regularized Joint Mixture Models

    Updated: 2023-01-31 16:29:40
    Konstantinos Perrakis, Thomas Lartigue, Frank Dondelinger, Sach Mukherjee; 24(19):1–47, 2023. Abstract: Regularized regression models are well studied and, under appropriate conditions, offer fast and statistically interpretable results. However, large data in many applications are heterogeneous in the sense of harboring distributional differences between latent groups. Then, the assumption that the conditional distribution of response $Y$ given features $X$ is the same for all samples may not hold. Furthermore, in scientific applications, the covariance structure of the …

  • Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application to Credit-Risk Evaluation

    Updated: 2023-01-31 16:29:40
    Cynthia Rudin, Yaron Shaposhnik; 24(16):1–44, 2023. Abstract: We develop a method for understanding specific predictions made by global predictive models by constructing local models tailored to each specific observation (these are also called "explanations" in the literature). Unlike existing work that "explains" specific observations by approximating global models in the vicinity of these observations, we fit models that are globally consistent with predictions made by the global model on past …

  • Python package for causal discovery based on LiNGAM

    Updated: 2023-01-31 16:29:40
    Causal discovery is a methodology for learning causal graphs from data, and LiNGAM is a well-known model for causal discovery. This paper describes an open-source Python package for causal discovery based on LiNGAM. The package implements various LiNGAM methods under different settings like time series cases, multiple-group cases, mixed data cases, and hidden common cause cases, in addition to evaluation of statistical reliability and model assumptions. The source code is freely available under the MIT license at https://github.com/cdt15/lingam.
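    A minimal usage sketch, assuming the package's documented DirectLiNGAM estimator and its fitted attributes; the two-variable synthetic data set below is made up purely for illustration.

        import numpy as np
        import lingam

        # Toy data: x1 is a linear function of x0 plus non-Gaussian (uniform) noise,
        # which is the setting LiNGAM is designed to identify.
        rng = np.random.default_rng(0)
        x0 = rng.uniform(size=1000)
        x1 = 2.0 * x0 + rng.uniform(size=1000)
        X = np.column_stack([x0, x1])

        model = lingam.DirectLiNGAM()
        model.fit(X)
        print(model.causal_order_)      # estimated causal ordering of the columns
        print(model.adjacency_matrix_)  # estimated weighted adjacency matrix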

  • Sampling random graph homomorphisms and applications to network data analysis

    Updated: 2023-01-31 16:29:40
    Hanbaek Lyu, Facundo Memoli, David Sivakoff; 24(9):1–79, 2023. Abstract: A graph homomorphism is a map between two graphs that preserves adjacency relations. We consider the problem of sampling a random graph homomorphism from a graph into a large network. We propose two complementary MCMC algorithms for sampling random graph homomorphisms and establish bounds on their mixing times and the concentration of their time averages. Based on our sampling algorithms, we propose a novel framework for network data analysis that circumvents …

  • AutoKeras: An AutoML Library for Deep Learning

    Updated: 2023-01-31 16:29:40
    Haifeng Jin, François Chollet, Qingquan Song, Xia Hu; 24(6):1–6, 2023. Abstract: To use deep learning, one needs to be familiar with various software tools like TensorFlow or Keras, as well as various model architecture and optimization best practices. Despite recent progress in software usability, deep learning remains a highly specialized occupation. To enable people with limited machine learning and programming experience to adopt deep learning, we developed AutoKeras, an Automated Machine Learning (AutoML) library that automates the process of model …
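    A hedged sketch of the task-API workflow the abstract describes, using AutoKeras's documented ImageClassifier; parameters such as max_trials and epochs are illustrative, not recommendations.

        import autokeras as ak
        from tensorflow.keras.datasets import mnist

        (x_train, y_train), (x_test, y_test) = mnist.load_data()

        # Let AutoKeras search over a handful of candidate architectures.
        clf = ak.ImageClassifier(max_trials=3, overwrite=True)
        clf.fit(x_train, y_train, epochs=2)
        print(clf.evaluate(x_test, y_test))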

  • Cluster-Specific Predictions with Multi-Task Gaussian Processes

    Updated: 2023-01-31 16:29:40
    Arthur Leroy, Pierre Latouche, Benjamin Guedj, Servane Gey; 24(5):1–49, 2023. Abstract: A model involving Gaussian processes (GPs) is introduced to simultaneously handle multitask learning, clustering, and prediction for multiple functional data. This procedure acts as a model-based clustering method for functional data as well as a learning step for subsequent predictions for new tasks. The model is instantiated as a mixture of multi-task GPs with common mean processes. A variational EM algorithm is derived for dealing with the optimisation of …

  • Efficient Structure-preserving Support Tensor Train Machine

    Updated: 2023-01-31 16:29:40
    Kirandeep Kour, Sergey Dolgov, Martin Stoll, Peter Benner; 24(4):1–22, 2023. Abstract: An increasing amount of collected data are high-dimensional multi-way arrays (tensors), and it is crucial for efficient learning algorithms to exploit this tensorial structure as much as possible. The ever-present curse of dimensionality for high-dimensional data and the loss of structure when vectorizing the data motivate the use of tailored low-rank tensor classification methods. In the presence of small amounts of training data, kernel methods offer an …

  • Bayesian Spiked Laplacian Graphs

    Updated: 2023-01-31 16:29:40
    Leo L. Duan, George Michailidis, Mingzhou Ding; 24(3):1–35, 2023. Abstract: In network analysis, it is common to work with a collection of graphs that exhibit heterogeneity. For example, neuroimaging data from patient cohorts are increasingly available. A critical analytical task is to identify communities, and graph Laplacian-based methods are routinely used. However, these methods are currently limited to a single network and also do not provide measures of uncertainty on the community assignment. In this work, we first propose a probabilistic network model called the …

  • Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search

    Updated: 2023-01-31 16:29:40
    Benjamin Moseley, Joshua R. Wang; 24(1):1–36, 2023. Abstract: Hierarchical clustering is a data analysis method that has been used for decades. Despite its widespread use, the method has an underdeveloped analytical foundation. Having a well-understood foundation would both support the currently used methods and help guide future improvements. The goal of this paper is to give an analytic framework to better understand observations seen in practice. This paper considers the dual of a problem …
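    The paper is about approximation guarantees rather than code, but for orientation, here is a hedged sketch of the two heuristics named in its title: bottom-up average-linkage clustering (via SciPy) and a tiny top-down bisecting k-means (via scikit-learn's KMeans), on made-up 2D points.

        import numpy as np
        from scipy.cluster.hierarchy import linkage
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])

        # Average linkage: repeatedly merge the two clusters with the smallest mean pairwise distance.
        Z = linkage(X, method="average")

        # Bisecting k-means: recursively split clusters with 2-means, building the tree top-down.
        def bisect(points, depth=0, max_depth=2):
            if depth == max_depth or len(points) < 2:
                return points
            labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
            return [bisect(points[labels == k], depth + 1, max_depth) for k in (0, 1)]

        tree = bisect(X)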

  • CNN’s Harry Enten: Data for TV

    Updated: 2023-01-30 15:00:00
    Harry Enten (pic: CNN). Harry Enten (@forecasterenten) is one of the most high-profile data journalists in the world. He explains the numbers every day on CNN – whether it's election polling, sports or even his original passion: meteorology, specifically snowstorms. "I definitely see myself as a storyteller," says Enten, and he chats with Alberto and Simon about his …

  • ✚ Visualization Tools and Learning Resources, January 2023 Roundup

    Updated: 2023-01-26 19:30:02
    January 26, 2023 | Topic: The Process, roundup. Welcome to issue 223 of The Process, the newsletter where we look closer at how the charts get made. I'm Nathan Yau, and every month I collect useful tools and resources to help you make better charts. Here's the good stuff for January. (Members only.)

  • Misuse of the rainbow color scheme to visualize scientific data

    Updated: 2023-01-26 08:49:45
    January 26, 2023 | Topic: Design, color, nature, rainbow, science. Fabio Crameri, Grace Shephard, and Philip Heron in Nature discuss the drawbacks of using the rainbow color scheme to visualize data and more readable alternatives: "The accurate representation of data is essential in science communication. However, colour maps that visually distort data through uneven colour gradients or are unreadable to those with colour-vision deficiency remain prevalent in science. These include, but are not limited to, rainbow-like and red–green colour maps. Here, we present a simple guide for the scientific use of colour. We show how scientifically derived …"

  • Cinematic visualization

    Updated: 2023-01-25 10:01:11
    January 25, 2023 | Topic: Design, 3-D, cinematic, narrative, research. Using the third dimension in visualization can be tricky because of rendering, perception, and presentation. Matthew Conlen, Jeffrey Heer, Hillary Mushkin, and Scott Davidoff provide a strong use case in their paper on what they call cinematic visualization: "The many genres of narrative visualization (e.g., data comics, data videos) each offer a unique set of affordances and constraints. To better understand a genre that we call cinematic visualizations—3D visualizations that make highly deliberate use of a camera to convey a narrative—we gathered 50 examples and analyzed their traditional cinematic aspects to identify the …"

  • ✚ How to Animate Packed Circles in R

    Updated: 2023-01-25 00:48:59
    Tutorial (animation, R). By Nathan Yau. Pack circles, figure out the transitions between time segments, and then generate frames to string together. To animate packed circles, I usually use JavaScript, but I've been playing with the packcircles package in R. It doesn't have an animation option, but I was curious how to make things move. (Members only.)

  • Decomposition Tree Now in Qlik Sense

    Updated: 2023-01-24 11:30:59
    Happy day, data analysts using Qlik! We are thrilled to announce the release of our groundbreaking Decomposition Tree extension for Qlik Sense! Previously unavailable in Qlik natively or in a third-party extension, a Decomposition Tree is an incredibly powerful technique. It allows you to intuitively explore your core metrics across a number of dimensions, quickly […]

  • Deluxe Combo Chart & Versatile Circular Gauge for Qlik Sense

    Updated: 2023-01-24 11:25:03
    In addition to the Decomposition Tree and new Gantt Chart features, we are excited to release two astonishing extensions. Enjoy making sense of your metrics using the brand new approaches you never had in Qlik before — with our Deluxe Combo Chart and Versatile Circular Gauge for Qlik Sense! Now, join us for a quick […]

  • New Progress Tracking Features for Gantt Charts in Qlik Sense

    Updated: 2023-01-24 11:21:05
    Tracking project progress using Gantt charts in Qlik Sense has become even easier with the latest update of our dedicated extension! Learn about the just-released features and improvements. Then update to the newest version of AnyGantt for Qlik and check them out in action! Read more at qlik.anychart.com »

  • Visualizing Text Data Hierarchy with Word Trees

    Updated: 2023-01-19 23:15:47
    Over the past few weeks, I have been looking for a quick and effective way of representing the structural differences within a set of similar-looking short sentences. To provide a bit of context, as we approached the end of 2022, my workmates and I got heavily involved in a planning phase for the new year […]

  • Exciting Visual Graphics That Tell Stories — DataViz Weekly

    Updated: 2023-01-13 11:35:42
    DataViz Weekly is our regular blog feature where we curate the most exciting charts, maps, and infographics we've recently come across. Today, we want to attract your attention to some cool visual stories published out there near the end of the last year, which we did not get a chance to spotlight before: Animal species […]

  • 2022 Year in Data Visualizations — DataViz Weekly

    Updated: 2023-01-06 19:52:41
    Finally we are in 2023! May this new year be the best one for all of you! Before getting too far into 2023, we thought it would be interesting to look back at 2022 in data visualizations. And the first DataViz Weekly in the new year seems like a nice occasion! Let's say farewell to […]
